Era-aware AI vocabulary breakdown + speculative gap-filling pattern #111

Closed
philippdubach wants to merge 1 commit into blader:main from philippdubach:era-vocab-and-gap-filling

@philippdubach
Contributor

Summary

Two narrowly scoped updates sourced from the current revision of Wikipedia: Signs of AI writing (revision fetched 2026-05-01).

  • §7 (AI Vocabulary): Replaces the flat high-frequency word list with the era-specific clusters now documented on the wiki page (GPT-4 / GPT-4o / GPT-5 eras). Adds "bolstered" and "meticulous/meticulously" to the master list, plus a one-line caveat about literal vs. figurative usage (e.g., "underscore" as a literal underline, "delve" in geology).
  • §21 (renamed to "Knowledge-Cutoff Disclaimers and Speculative Gap-Filling"): Covers the newer retrieval-augmented pattern where a model, having failed to find a source, writes a paragraph about not having found one and then speculates that the subject "maintains a low profile" or "keeps personal details private." Adds a second before/after example for the gap-filling case.
  • README: Tightens the §21 row label to reflect both subpatterns.

No new patterns; pattern count stays at 29. No version bump — happy to defer that to whatever coordination you do with the open v2.6.0 PRs (#85, #98).
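
To make the §21 gap-filling tell concrete, here is a minimal Python sketch that flags sentences containing the two phrases quoted above. It is an illustration only, not code from SKILL.md: the names `GAP_FILLING_PHRASES` and `find_gap_filling` are hypothetical, and a real check would likely need a longer phrase list plus human review.

```python
import re

# The two phrases quoted in this PR description; a real list would be longer.
GAP_FILLING_PHRASES = (
    "maintains a low profile",
    "keeps personal details private",
)

# Hypothetical helper pattern: case-insensitive, tolerant of extra whitespace.
_GAP_RE = re.compile(
    "|".join(r"\s+".join(map(re.escape, p.split())) for p in GAP_FILLING_PHRASES),
    re.IGNORECASE,
)


def find_gap_filling(text: str) -> list[str]:
    """Return each sentence that contains a speculative gap-filling phrase."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if _GAP_RE.search(s)]
```

A hit only marks a candidate sentence. Whether the claim is unsourced filler still depends on whether a source actually backs it.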

Test plan

  • Diff touches two files: SKILL.md and README.md
  • §7 keeps its existing Before/After example unchanged
  • §21 keeps its existing Before/After example as the cutoff-disclaimer case, and adds a separate gap-filling Before/After
  • Pattern numbering and section anchors are unchanged
  • Skill loads in Claude Code with no parse errors

Source: Wikipedia:Signs of AI writing — see "High density of AI vocabulary words" and "Knowledge-cutoff disclaimers and speculation about gaps in sources" sections.

…tern

Two changes sourced from Wikipedia: Signs of AI writing (revision
fetched 2026-05-01).

§7 (AI Vocabulary): replace the flat high-frequency word list with
the era-specific clusters now documented on the wiki page (GPT-4 /
GPT-4o / GPT-5 eras). Add 'bolstered' and 'meticulous/meticulously'
to the master list, and a one-line caveat about literal vs
figurative usage.

§21 (renamed to "Knowledge-Cutoff Disclaimers and Speculative
Gap-Filling"): cover the newer retrieval-augmented pattern where
the model, having failed to find a source, writes a paragraph about
not having found one and then speculates that the subject
"maintains a low profile" or "keeps personal details private."
Adds a second before/after example for the gap-filling case.

README: tighten the §21 row label to reflect both subpatterns.

No version bump (leaving that to the maintainer to coordinate with
the open v2.6.0 PRs). No new patterns; pattern count stays at 29.
philippdubach added a commit to philippdubach/humanizer that referenced this pull request May 1, 2026
Brings the fork's main branch in line with the maintained local
v2.6.0, consolidating the changes that are also opened as focused
PRs against blader/humanizer (blader#111, blader#112, blader#113):

- §7 expanded with era-specific AI vocabulary clusters (GPT-4 /
  GPT-4o / GPT-5 eras), plus 'bolstered' and 'meticulous' added to
  the master list and a literal-vs-figurative caveat.
- §21 renamed to "Knowledge-Cutoff Disclaimers and Speculative
  Gap-Filling"; covers the retrieval-augmented "maintains a low
  profile" / "keeps personal details private" speculation pattern.
- New patterns §30-34: reference-markup artifacts (turn0search0,
  oaicite, utm_source=chatgpt.com, etc.), placeholder leftovers,
  Markdown/wikitext contamination, formal "Conclusion" closers,
  didactic disclaimers.
- New Detection Guidance group: what NOT to flag (false positives),
  signs of human writing to preserve, and per-model LLM idiolects.
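
The reference-markup artifacts named above can be matched mechanically. The sketch below is a hypothetical Python scanner covering only the three markers this message lists; the "etc." is left unfilled. Generalizing `turn0search0` to any digits is an assumption, and `ARTIFACT_PATTERNS` and `find_artifacts` are illustrative names, not part of the skill.

```python
import re

# Only the three markers named in the commit message above.
ARTIFACT_PATTERNS = {
    # Assumption: the digits in "turn0search0" vary between citations.
    "turn-search token": re.compile(r"\bturn\d+search\d+\b"),
    "oaicite marker": re.compile(r"oaicite"),
    "chatgpt utm parameter": re.compile(r"utm_source=chatgpt\.com"),
}


def find_artifacts(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs, grouped in pattern order."""
    hits = []
    for label, pattern in ARTIFACT_PATTERNS.items():
        hits.extend((label, m.group(0)) for m in pattern.finditer(text))
    return hits
```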

Frontmatter version bumped to 2.6.0. README pattern table updated
(29 → 34 patterns) with a new Artifacts and Contamination section
and a pointer to Detection Guidance. WARP.md count corrected from
the stale "25 patterns" to 34.

Sourced from Wikipedia: Signs of AI writing (revision fetched
2026-05-01).
duathron added a commit to duathron/humanizer-ext that referenced this pull request May 22, 2026
- Add DETECTION GUIDANCE section (false positives, human-writing
  signs, LLM idiolects) so editors know what NOT to flag (PR blader#113)
- Add Tier-1 AI-iness density pre-flight in Full mode; auto-drops to
  Quick when density = 0 to protect human-first drafts (PR blader#115 adapted)
- Expand #7 with era-specific vocabulary clusters (GPT-4 / GPT-4o /
  GPT-5 eras) and figurative-vs-literal caveat (PR blader#111)
- Expand #9 with "rather than" dismissals + on-the-table test (PR blader#85)
- Expand #14 with paired em dash bracketing + 4 fix options (PR blader#85)
- Expand #21 with speculative gap-filling ("maintains a low profile"
  template detection) (PR blader#111)
- Expand #23 with three more didactic disclaimers (subsumes pattern
  34 from PR blader#112)
- Expand #25 with structural "## Conclusion" section note
- Add pattern #35 Debunking-Pose Headings -- heading-level AI tells
  that slip through prose-only passes (PR blader#116)
- Add patterns #36 Conditional Frame Stacking and #37 Miscalibrated
  Epistemic Confidence (PR blader#85)
- Add patterns #38 Reference-Markup Artifacts, #39 Phrasal Templates /
  Placeholder Text, and #40 Markdown / Wikitext Contamination --
  three chat-UI copy-paste tells that confirm AI involvement (PR blader#112)
- Extend domain overrides for #35-37; #38-40 are universal
- Extend final AI audit from 9 to 13 points
- README: pattern count 34 -> 40, three new section rows, updated
  fork-differentiator table, 3.2.0 version-history entry
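
The density pre-flight in the second bullet could look roughly like the Python sketch below. The commit message does not define "density" or show the Tier-1 list, so both are assumptions here: density is taken as Tier-1 hits per 100 words, and `TIER1_WORDS` is a stand-in built from vocabulary mentioned elsewhere on this page. `tier1_density` and `choose_mode` are hypothetical names.

```python
import re

# Stand-in vocabulary taken from words named on this page.
# The fork's real Tier-1 list is not shown in the commit message.
TIER1_WORDS = {"bolstered", "meticulous", "meticulously", "delve", "underscore"}


def tier1_density(text: str) -> float:
    """Tier-1 hits per 100 words (an assumed definition of density)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in TIER1_WORDS)
    return 100.0 * hits / len(words)


def choose_mode(text: str, requested: str = "full") -> str:
    """Drop a Full request to Quick when no Tier-1 words are present."""
    if requested == "full" and tier1_density(text) == 0:
        return "quick"
    return requested
```

Plain word matching cannot tell literal from figurative use, so a geology text that says "delve" would still count as a hit under this sketch.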

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
blader added a commit that referenced this pull request May 27, 2026
- §14: turn em-dash "overuse" into a hard cut (no em or en dashes in the
  final rewrite), with a replacement ladder and a final scan. Idea from #96.
- §21: expand to cover speculative gap-filling ("maintains a low profile,"
  "keeps personal details private") where a model invents filler instead of
  saying a source is missing. Idea from #111.
- New pattern #30, diff-anchored writing: describe the thing as it is, not as
  a narration of what changed. Idea from #81.

Hand-ported lean versions rather than merging the source PRs. 30 patterns
total; README and AGENTS.md updated to match.
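
The "final scan" in the §14 bullet can be sketched as a simple character search. This is an illustrative Python example, not the skill's implementation; `find_dashes` is a hypothetical name, and the replacement ladder is not reproduced because its steps are not listed here.

```python
# Unicode escapes are used so the snippet itself contains neither character.
EM_DASH = "\u2014"
EN_DASH = "\u2013"


def find_dashes(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, name) for every em or en dash, 1-indexed."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == EM_DASH:
                hits.append((line_no, col, "em dash"))
            elif ch == EN_DASH:
                hits.append((line_no, col, "en dash"))
    return hits
```

An empty result means the rewrite passes the hard cut. Ordinary hyphens are left alone.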

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@blader
Owner

blader commented May 27, 2026

Adopted the speculative gap-filling half in v2.7.0 — §21 now covers 'maintains a low profile / keeps personal details private' as unsourced filler. Left out the era-vocabulary taxonomy on purpose (it's forensic dating that ages quickly; we just removed a similar model-fingerprinting section). Thanks for surfacing the gap-filling tell.

@blader blader closed this May 27, 2026